Basic Statistics

Raw Counts

Name                   Value
Rows                   5,000
Columns                31
Discrete columns       14
Continuous columns     17
All missing columns    0
Missing observations   662
Complete Rows          4,928
Total observations     155,000
Memory allocation      2.6 Mb
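
The raw counts above are enough to derive the corresponding shares by hand. The sketch below is only an arithmetic check written in Python, not the code that produced this report; the variable names and the `pct` helper are illustrative assumptions, and the input figures are copied from the table.

```python
# Figures copied from the Raw Counts table.
rows = 5_000
columns = 31
discrete_columns = 14
continuous_columns = 17
all_missing_columns = 0
missing_observations = 662
complete_rows = 4_928

# Total observations is rows times columns (155,000 in the table).
total_observations = rows * columns

# Rows that contain at least one missing value.
incomplete_rows = rows - complete_rows


def pct(part, whole):
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100 * part / whole, 2)


percentages = {
    "Discrete columns": pct(discrete_columns, columns),
    "Continuous columns": pct(continuous_columns, columns),
    "All missing columns": pct(all_missing_columns, columns),
    "Missing observations": pct(missing_observations, total_observations),
    "Complete rows": pct(complete_rows, rows),
}

for name, value in percentages.items():
    print(f"{name}: {value}%")
print(f"Rows with at least one missing value: {incomplete_rows}")
```

Column shares are taken relative to the 31 columns, the missing-observation share relative to all 155,000 cells, and the complete-row share relative to the 5,000 rows.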

Percentages

Data Structure

Missing Data Profile

Univariate Distribution

Histogram

Bar Chart (with frequency)

## 8 columns ignored with more than 50 categories.
## case_number: 5000 categories
## date: 2574 categories
## block: 3773 categories
## iucr: 188 categories
## description: 176 categories
## location_description: 82 categories
## beat: 274 categories
## location: 4412 categories

QQ Plot

## Warning: Removed 37 rows containing non-finite outside the scale range (`stat_qq()`).
## Warning: Removed 37 rows containing non-finite outside the scale range (`stat_qq_line()`).

## Warning: Removed 70 rows containing non-finite outside the scale range (`stat_qq()`).
## Warning: Removed 70 rows containing non-finite outside the scale range (`stat_qq_line()`).

Correlation Analysis

## 11 features with more than 20 categories ignored!
## case_number: 4928 categories
## date: 2557 categories
## block: 3717 categories
## iucr: 187 categories
## primary_type: 27 categories
## description: 175 categories
## location_description: 80 categories
## beat: 274 categories
## district: 22 categories
## fbi_code: 22 categories
## location: 4385 categories
## Warning in cor(x = structure(list(id = c(13911205, 13914716, 13911459, 13911033, : the standard deviation is zero
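
The correlation step ignores features with more than 20 categories, while the bar chart and principal component steps use a limit of 50. The sketch below applies both limits to the category counts listed in the correlation log above, to show which features the stricter limit adds. It is a Python illustration only; the `ignored` helper is a hypothetical name, not a function of the reporting tool. The largest count, 4928 for case_number, equals the Complete Rows figure, which suggests (but does not confirm) that this step ran on complete rows only.

```python
# Category counts copied from the correlation log.
category_counts = {
    "case_number": 4928,
    "date": 2557,
    "block": 3717,
    "iucr": 187,
    "primary_type": 27,
    "description": 175,
    "location_description": 80,
    "beat": 274,
    "district": 22,
    "fbi_code": 22,
    "location": 4385,
}


def ignored(counts, max_categories):
    """Names of features with more than max_categories distinct values."""
    return [name for name, n in counts.items() if n > max_categories]


ignored_at_20 = ignored(category_counts, 20)
ignored_at_50 = ignored(category_counts, 50)

# Features excluded only under the stricter limit of 20.
only_at_20 = [name for name in ignored_at_20 if name not in ignored_at_50]

print(len(ignored_at_20), len(ignored_at_50))
print(only_at_20)
```

With these counts, the limit of 20 excludes all 11 listed features and the limit of 50 excludes 8 of them; primary_type, district and fbi_code are the three that only the stricter limit removes.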

Principal Component Analysis

## 8 features with more than 50 categories ignored!
## case_number: 4928 categories
## date: 2557 categories
## block: 3717 categories
## iucr: 187 categories
## description: 175 categories
## location_description: 80 categories
## beat: 274 categories
## location: 4385 categories
## Warning in plot_prcomp(data = structure(list(id = c(13911205, 13914716, : The following features are dropped due to zero variance:
##  * year
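
The zero-variance message for year, and the earlier "standard deviation is zero" warning from the correlation step, both describe a column whose values never change. Whether year is the column behind both messages is an assumption here. The sketch below, in Python, shows why such a column cannot take part in a Pearson correlation: the denominator becomes zero. The four id values are the ones visible in the warning text; the year value is a hypothetical placeholder, since the report does not show it, and `pearson` is an illustrative helper.

```python
import statistics

# The id values appear in the warning text; the year value is a
# hypothetical placeholder for a column that is constant.
ids = [13911205, 13914716, 13911459, 13911033]
year = [2025, 2025, 2025, 2025]


def pearson(x, y):
    """Pearson correlation, or None when either input has zero variance."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    dev_x = [a - mean_x for a in x]
    dev_y = [b - mean_y for b in y]
    ss_x = sum(d * d for d in dev_x)
    ss_y = sum(d * d for d in dev_y)
    if ss_x == 0 or ss_y == 0:
        return None  # denominator is zero, so the coefficient is undefined
    cov = sum(a * b for a, b in zip(dev_x, dev_y))
    return cov / (ss_x * ss_y) ** 0.5


year_variance = statistics.pvariance(year)  # zero for a constant column
r_id_year = pearson(ids, year)              # undefined, returned as None

print(year_variance, r_id_year)
```

A constant column also contributes nothing to principal components and cannot be scaled to unit variance, which is consistent with it being dropped before that step.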